Skip to content

feat: allow for custom run.yaml at container runtime#84

Merged
mergify[bot] merged 1 commit intoopendatahub-io:mainfrom
nathan-weinberg:entrypoint
Nov 13, 2025
Merged

feat: allow for custom run.yaml at container runtime#84
mergify[bot] merged 1 commit intoopendatahub-io:mainfrom
nathan-weinberg:entrypoint

Conversation

@nathan-weinberg
Copy link
Copy Markdown
Collaborator

@nathan-weinberg nathan-weinberg commented Oct 16, 2025

What does this PR do?

this commit changes the server start command to be dynamically generated based on Llama Stack version

it allows users to pass a custom run.yaml if they so choose will keeping the official run.yaml we ship with the distro image as a default

Closes #83

Summary by CodeRabbit

  • New Features

    • Container startup now supports flexible run configuration, allowing custom run files or configuration identifiers to be specified at launch.
  • Documentation

    • Added guidance and examples for running the distribution image with custom configuration files (including notes about dependency compatibility).

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Oct 16, 2025

Walkthrough

Replaces the container ENTRYPOINT with a new executable script and adds that script to the image; the script chooses the llama stack run target from RUN_CONFIG_PATH, DISTRO_NAME, or a bundled /opt/app-root/run.yaml. README gains instructions for running with a custom run YAML.

Changes

Cohort / File(s) Summary
Containerfile updates
distribution/Containerfile, distribution/Containerfile.in
Copy distribution/entrypoint.sh into the image at /opt/app-root/entrypoint.sh, set executable permissions, create ${HOME}/.llama and ${HOME}/.cache, preserve copy of run.yaml, and change ENTRYPOINT from ["llama","stack","run","/opt/app-root/run.yaml"] to ["/opt/app-root/entrypoint.sh"].
Entrypoint script
distribution/entrypoint.sh
New shell entrypoint that: exits on error; if RUN_CONFIG_PATH is set and the file exists, runs llama stack run "$RUN_CONFIG_PATH"; else if DISTRO_NAME is set, runs llama stack run "$DISTRO_NAME"; otherwise runs llama stack run /opt/app-root/run.yaml, forwarding additional args.
Documentation
README.md
Adds two "Running with a custom run YAML" sections with podman run examples and notes about dependency alignment when using a custom YAML.

Sequence Diagram(s)

sequenceDiagram
    participant Container as Container Start
    participant Entrypoint as /opt/app-root/entrypoint.sh
    participant Llama as llama stack run

    Container->>Entrypoint: exec /opt/app-root/entrypoint.sh "$@"
    alt RUN_CONFIG_PATH set & exists
        Entrypoint->>Llama: llama stack run "$RUN_CONFIG_PATH" "$@"
    else if DISTRO_NAME set
        Entrypoint->>Llama: llama stack run "$DISTRO_NAME" "$@"
    else
        Entrypoint->>Llama: llama stack run /opt/app-root/run.yaml "$@"
    end
    Llama-->>Container: process starts
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Review focus areas:
    • Verify entrypoint.sh conditional logic and quoting/argument forwarding.
    • Confirm file copied in both Containerfile and Containerfile.in with correct permissions.
    • Validate README examples and note about dependency alignment.

Poem

🐰
A rabbit hops to start the run,
Scripts replace the single one.
Paths chosen by env or file,
Startup now is more versatile.
Hop, boom — the stack begins to smile.

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main change: introducing a custom run.yaml capability at container runtime via an entrypoint script mechanism.
Linked Issues check ✅ Passed The PR fully addresses all objectives from issue #83: implements an entrypoint shell script [distribution/entrypoint.sh], supports dynamic startup via environment variables, preserves the shipped run.yaml as fallback, and enables custom run.yaml mounting at runtime.
Out of Scope Changes check ✅ Passed All changes are directly scoped to issue #83 objectives. The entrypoint script, Containerfile updates, and README documentation all align with implementing custom run.yaml support at container runtime.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e112b5 and 1d07eff.

📒 Files selected for processing (4)
  • README.md (1 hunks)
  • distribution/Containerfile (1 hunks)
  • distribution/Containerfile.in (1 hunks)
  • distribution/entrypoint.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • distribution/entrypoint.sh
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...o/opendatahub/llama-stack: ``` > [!IMPORTANT] > The distribution image ship...

(EN_EXCESSIVE_EXCLAMATION)


[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...

(QB_NEW_EN_HYPHEN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (3)
README.md (1)

56-70: LGTM! Documentation is clear and helpful.

The new section effectively documents how to run the container with a custom run YAML. The example clearly shows the required volume mount and environment variable, and the IMPORTANT note appropriately warns users about potential dependency incompatibilities—a concern raised during PR discussion.


Note: The static analysis hints about hyphenation ("custom-run YAML") and excessive exclamation marks are style preferences that don't need addressing in technical documentation with callout blocks.

distribution/Containerfile (1)

64-69: LGTM! Auto-generated changes are correct.

The entrypoint script is properly copied with executable permissions (755), and the ENTRYPOINT directive correctly references the absolute path. Since this file is auto-generated (see line 1 warning), the template source in distribution/Containerfile.in is the appropriate place for any changes.

distribution/Containerfile.in (1)

8-10: LGTM! Template changes correctly implement the new entrypoint.

The changes are well-structured:

  • Directory creation for .llama and .cache prepares the runtime environment
  • The entrypoint script is properly copied with executable permissions
  • The ENTRYPOINT directive uses exec form with an absolute path, which is the correct and portable approach

The implementation achieves the PR objective of enabling a dynamic, configurable entrypoint that can accept custom run YAML paths via environment variables while maintaining the default fallback behavior.

Also applies to: 17-17


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
distribution/build.py (1)

55-73: Consider making the version threshold configurable.

The hardcoded threshold version "0.2.23" (line 64) matches the current CURRENT_LLAMA_STACK_VERSION (line 17). When the version is bumped in the future, maintainers must remember to update this threshold if the entrypoint behavior needs to change, creating a maintenance burden.

Consider extracting the threshold as a module-level constant or deriving it from CURRENT_LLAMA_STACK_VERSION:

# Near the top of the file, after CURRENT_LLAMA_STACK_VERSION
ENTRYPOINT_CHANGE_VERSION = "0.2.23"  # Version where entrypoint changed from python -m to llama stack run

def get_entrypoint(llama_stack_version):
    """Determine the appropriate ENTRYPOINT based on llama-stack version."""
    if is_install_from_source(llama_stack_version):
        return 'ENTRYPOINT ["llama", "stack", "run"]'

    try:
        current_version = version.parse(llama_stack_version)
        threshold_version = version.parse(ENTRYPOINT_CHANGE_VERSION)
        # ... rest of function

Additionally, consider narrowing the exception handling on line 70 from except Exception to except version.InvalidVersion to catch only version parsing errors, allowing other unexpected errors to surface properly.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c752da and 07f27cb.

📒 Files selected for processing (3)
  • distribution/Containerfile (1 hunks)
  • distribution/Containerfile.in (1 hunks)
  • distribution/build.py (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-15T14:25:54.837Z
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.

Applied to files:

  • distribution/Containerfile
  • distribution/Containerfile.in
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (3)
distribution/build.py (2)

15-15: LGTM!

The packaging.version import is appropriate for semantic version comparison and ensures correct handling of version strings.


195-216: LGTM!

The integration of the entrypoint parameter into generate_containerfile() and the main flow is correct and complete. The function signature update, template substitution, and call site changes properly implement the dynamic entrypoint feature.

Also applies to: 246-250

distribution/Containerfile (1)

62-63: Critical: Environment variable won't expand in CMD exec form.

Docker does not expand environment variables in JSON array (exec) form. The literal string "${APP_ROOT}/run.yaml" will be passed to the entrypoint instead of the expanded path /opt/app-root/run.yaml, causing the default startup to fail.

Solution: Use the literal path in the JSON array:

 ENTRYPOINT ["llama", "stack", "run"]
-CMD ["${APP_ROOT}/run.yaml"]
+CMD ["/opt/app-root/run.yaml"]

Note: Users can still override the config path at runtime with docker run <image> /custom/path/to/run.yaml, which satisfies the PR objective to allow custom run.yaml files.

⛔ Skipped due to learnings
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.

@nathan-weinberg nathan-weinberg added the do-not-merge Apply to PRs that should not be merged (yet) label Oct 16, 2025
@nathan-weinberg nathan-weinberg force-pushed the entrypoint branch 2 times, most recently from 2d74b58 to 0382ced Compare October 16, 2025 18:30
@nathan-weinberg nathan-weinberg removed the do-not-merge Apply to PRs that should not be merged (yet) label Oct 16, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0382ced and a90eb0c.

📒 Files selected for processing (1)
  • README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...

(EN_EXCESSIVE_EXCLAMATION)


[grammar] ~70-~70: Ensure spelling is correct
Context: ...es already pre-installed. There is no guarentee that your custom run YAML will nessesar...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarentee that your custom run YAML will nessesarily work with the ...

(QB_NEW_EN_HYPHEN)


[grammar] ~70-~70: Ensure spelling is correct
Context: ...uarentee that your custom run YAML will nessesarily work with the included dependencies.

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~70-~70: There might be a mistake here.
Context: ...ily work with the included dependencies.

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
README.md (1)

59-59: Consider more concise wording for clarity.

The phrase "To do so, run the image in the following way. The "path" mentioned should be the path to your custom run YAML file." is slightly verbose. Consider condensing to:

"To use a custom run YAML, run the image with the path to your file as the final argument, mounted as a volume:"

This removes redundancy and gets straight to the instruction.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a90eb0c and 0d59530.

📒 Files selected for processing (1)
  • README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...

(EN_EXCESSIVE_EXCLAMATION)


[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...

(QB_NEW_EN_HYPHEN)


[grammar] ~70-~70: There might be a mistake here.
Context: ...ily work with the included dependencies.

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (1)
README.md (1)

57-70: Documentation addition is clear and well-aligned with PR objectives.

The new section provides users with clear instructions on how to supply a custom run.yaml file, complete with a practical example and an important caveat about dependency compatibility. The spelling corrections from the previous review ("guarantee" and "necessarily") are in place.

However, verify that the example command syntax (<path_in_container> as a positional argument after the image name) reflects the actual container behavior with the new entrypoint mechanism. This assumes the container ENTRYPOINT is a script or binary that accepts the path as a CMD argument—which aligns with the PR objective to decouple the entrypoint from the run.yaml path.

Confirm that:

  1. The new entrypoint script or binary correctly interprets the <path_in_container> argument
  2. The default behavior (when no custom path is provided) still uses /opt/app-root/run.yaml
  3. The container can run without arguments and still work with the default run.yaml

You may find this easier to verify once the container images are built or if you can review the entrypoint implementation (likely in distribution/build.py or a shell script entrypoint).

Copy link
Copy Markdown
Collaborator

@cdoern cdoern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the dependencies we package remain locked: that pretty much only allows for the same set of providers but with different config options or maybe different server level config options, right?

Not sure if this is possible or worth it, but if the deps are locked, it might make sense to validate that people are not changing the providers or introducing a provider with different dependencies. Otherwise this feature will commonly break

@nathan-weinberg
Copy link
Copy Markdown
Collaborator Author

If the dependencies we package remain locked: that pretty much only allows for the same set of providers but with different config options or maybe different server level config options, right?

Correct

Not sure if this is possible or worth it, but if the deps are locked, it might make sense to validate that people are not changing the providers or introducing a provider with different dependencies. Otherwise this feature will commonly break

See the README changes

@cdoern
Copy link
Copy Markdown
Collaborator

cdoern commented Oct 16, 2025

@nathan-weinberg

See the README changes

are you pointing me to the readme where it says There is *no* guarantee that your custom run YAML will necessarily work with the included dependencies.? I think this is kind of a non-answer to my question since I am asking if there is anything we can do to avoid these breakages (which will be frequent).

I am suggesting a simple diff against the provided run.yaml to check if the providers changed and if the new providers require net-new dependencies. This to me, seems like introducing a feature which will more often than not cause failures so some sort of guardrails is advisable IMO.

Let me know if this makes sense and is possible! Thanks

@nathan-weinberg
Copy link
Copy Markdown
Collaborator Author

@nathan-weinberg

See the README changes

are you pointing me to the readme where it says There is *no* guarantee that your custom run YAML will necessarily work with the included dependencies.? I think this is kind of a non-answer to my question since I am asking if there is anything we can do to avoid these breakages (which will be frequent).

Why do we want to avoid them? This is just to let people try their own run YAMLs, we aren't invested into making sure they work. That's why I pointed you here.

I am suggesting a simple diff against the provided run.yaml to check if the providers changed and if the new providers require net-new dependencies. This to me, seems like introducing a feature which will more often than not cause failures so some sort of guardrails is advisable IMO.

Let me know if this makes sense and is possible! Thanks

It's really more of a utility for development - any user trying to actually use this as a production server (either standalone or via the operator) should be using the official run YAML

@cdoern
Copy link
Copy Markdown
Collaborator

cdoern commented Oct 16, 2025

It's really more of a utility for development

Fair, all I am saying that more often than not this'll break without manual manipulation of the dependencies in the container

@nathan-weinberg
Copy link
Copy Markdown
Collaborator Author

It's really more of a utility for development

Fair, all I am saying that more often than not this'll break without manual manipulation of the dependencies in the container

True, but note we are already doing these custom run YAMLs from the operator - note this comment llamastack/llama-stack-k8s-operator#171 (comment) that led to the issue that this PR is resolving. This simply allows us to move the functionality from the operator to the distro image, which is preferred.

We can iterate with some additional checking like you've mentioned, but this is more focused on the movement of that functionality and removing some hardcoded things that really shouldn't be hardcoded.

@kami619
Copy link
Copy Markdown
Collaborator

kami619 commented Oct 16, 2025

@nathan-weinberg do we know what are the common customizations users are expecting to perform by following this path ? For example, I could see the below example you linked, I was imploring to see, if we have any insight into any broader customizations we can test proactively ?

This handles the case where we use the ConfigMap to store the run.yaml, when this happens we need to override the entrypoint to give the path of the run.yaml file.

@rhuss
Copy link
Copy Markdown
Collaborator

rhuss commented Oct 17, 2025

@nathan-weinberg do we know what are the common customizations users are expecting to perform by following this path ? For example, I could see the below example you linked, I was imploring to see, if we have any insight into any broader customizations we can test proactively ?

This handles the case where we use the ConfigMap to store the run.yaml, when this happens we need to override the entrypoint to give the path of the run.yaml file.

The customization is just that you can provide an alternative run.yaml than the default included in the distribution image. For this to work you need (a) to mount a new run.yaml into the container and then (b) provide a startup CMD to point to the run.yaml into the mounted file system. The operator takes care about that, but you can do the same just with vanillad podman, too:

podman run --rm -it \
  --mount type=bind,src="$(pwd)/config",target=/mnt,ro \
  lls-distribution:latest /mnt/run.yaml

assuming you have a ./config/run.yaml prepared.

@nathan-weinberg nathan-weinberg changed the title feat: allow for dynamic server start command and custom run.yaml feat: allow for custom run.yaml at container runtime Nov 4, 2025
@nathan-weinberg nathan-weinberg force-pushed the entrypoint branch 3 times, most recently from 3e112b5 to 8ee6257 Compare November 4, 2025 16:23
@nathan-weinberg nathan-weinberg removed the do-not-merge Apply to PRs that should not be merged (yet) label Nov 4, 2025
@derekhiggins
Copy link
Copy Markdown
Collaborator

lgtm, holding back /approve to allow @leseb or @rhuss to take a look as they had previously looked at a completely different version

@leseb
Copy link
Copy Markdown
Collaborator

leseb commented Nov 10, 2025

@nathan-weinberg is this downstream yet?

@nathan-weinberg
Copy link
Copy Markdown
Collaborator Author

no

this commit adds an entrypoint shell script that aligns
with the upstream containers

it allows users to pass a custom run.yaml if they so
choose while keeping the official run.yaml we ship with
the distro image as a default

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Copy link
Copy Markdown
Collaborator

@skamenan7 skamenan7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me with small nits.

if [ -n "$RUN_CONFIG_PATH" ] && [ -f "$RUN_CONFIG_PATH" ]; then
exec llama stack run "$RUN_CONFIG_PATH" "$@"
fi

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Nathan for this PR. few small nits.

If RUN_CONFIG_PATH is set but file doesn’t exist, container silently falls through to default config. Add explicit error or a clear message in the logs might help.

@@ -0,0 +1,12 @@
#!/bin/sh
set -e
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea to use instead to catch pipeline errors too.
set -euo pipefail

@leseb
Copy link
Copy Markdown
Collaborator

leseb commented Nov 13, 2025

no

Downstream first please!

@Elbehery
Copy link
Copy Markdown
Collaborator

no

Downstream first please!

is it urgent, or we wait till @nathan-weinberg is up & running ?

@nathan-weinberg
Copy link
Copy Markdown
Collaborator Author

no

Downstream first please!

There is considerably more testing here, so IMO it makes more sense to get consensus here where we have a better idea if the change will break the image

Will open downstream MR as well, but I want to get consensus/merge here first because maintaining/updating two branches concurrently is not a good use of time!

@mergify mergify bot merged commit ec0b301 into opendatahub-io:main Nov 13, 2025
6 checks passed
@nathan-weinberg nathan-weinberg deleted the entrypoint branch November 13, 2025 15:04
@nathan-weinberg
Copy link
Copy Markdown
Collaborator Author

varad-ahirwadkar pushed a commit to varad-ahirwadkar/llama-stack-distribution that referenced this pull request Apr 2, 2026
…s-1236

chore: Update EA2 wheel tags to 1236
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update container entrypoint to allow for dynamic startup command and custom run.yaml

8 participants